Optimizing the Chase: Scalable Data Integration under Constraints

Authors

  • George Konstantinidis
  • José Luis Ambite
Abstract

We are interested in scalable data integration and data exchange under constraints (dependencies). In data exchange, the problem is how to materialize a target database instance that satisfies the source-to-target and target dependencies and provides the certain answers. In data integration, the problem is how to rewrite a query over the target schema into a query over the source schemas that provides the certain answers. For both problems we use the chase algorithm, the main tool for reasoning with dependencies. Our first contribution is the frugal chase, which produces smaller universal solutions than the standard chase while remaining polynomial in data complexity. Our second contribution is to use the frugal chase to scale up query answering using views under LAV weakly acyclic target constraints, a useful language that captures RDF/S. This problem can be reduced to query rewriting using views without constraints by chasing the source-to-target mappings with the target constraints. We construct a compact graph-based representation of the mappings and the constraints, and we develop an efficient algorithm to run the frugal chase on this representation. We show experimentally that our approach scales to large problems, speeding up the compilation of the dependencies into the mappings by close to 2 and 3 orders of magnitude compared to the standard and the core chase, respectively. Compared to the standard chase, we improve online query rewriting time by a factor of 3, while producing equivalent but smaller rewritings of the original query.
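For readers unfamiliar with the chase, the following is a minimal sketch of the standard (restricted) chase with tuple-generating dependencies. It is not the frugal chase from the paper, and it does not use the paper's graph-based representation. The names `chase`, `matches`, `is_var`, the `?x` variable convention, the `_N1` null labels, and the example relations are all assumptions made for illustration.

```python
from itertools import count

def is_var(term):
    # Assumption: variables are strings starting with "?"; everything else is a constant.
    return isinstance(term, str) and term.startswith("?")

def matches(atoms, facts, subst=None):
    """Yield every substitution (extending subst) that maps all atoms into facts."""
    subst = dict(subst or {})
    if not atoms:
        yield subst
        return
    (pred, *args), rest = atoms[0], atoms[1:]
    for fact in facts:
        if fact[0] != pred or len(fact) != len(args) + 1:
            continue
        s = dict(subst)
        ok = True
        for a, v in zip(args, fact[1:]):
            if is_var(a):
                if s.setdefault(a, v) != v:  # variable already bound to another value
                    ok = False
                    break
            elif a != v:                     # constant mismatch
                ok = False
                break
        if ok:
            yield from matches(rest, facts, s)

def chase(facts, tgds, max_rounds=100):
    """Apply the tgds until no dependency is violated (restricted chase)."""
    facts = set(facts)
    fresh = count(1)
    for _ in range(max_rounds):
        changed = False
        for body, head in tgds:
            for s in list(matches(body, facts)):
                # Fire only if the head cannot already be satisfied.
                if next(matches(head, facts, s), None) is not None:
                    continue
                s = dict(s)
                for atom in head:
                    for t in atom[1:]:
                        if is_var(t) and t not in s:
                            s[t] = f"_N{next(fresh)}"  # fresh labeled null
                facts |= {(a[0], *(s.get(t, t) for t in a[1:])) for a in head}
                changed = True
        if not changed:
            return facts
    raise RuntimeError("chase did not terminate within max_rounds")

# Hypothetical dependencies:
#   Employee(x) -> exists d. WorksIn(x, d)
#   WorksIn(x, d) -> Dept(d)
tgds = [
    ([("Employee", "?x")], [("WorksIn", "?x", "?d")]),
    ([("WorksIn", "?x", "?d")], [("Dept", "?d")]),
]
source = {("Employee", "ann"), ("Employee", "bob"), ("WorksIn", "bob", "sales")}
result = chase(source, tgds)
```

In this sketch only `ann` triggers the first dependency, because `bob` already has a `WorksIn` fact, so a single labeled null is introduced. The abstract's point is that the frugal chase yields smaller universal solutions than the standard chase; how it does so is not reproduced here.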

Similar resources

Chasing Constrained

We investigate the implication problem for constrained tuple-generating dependencies (CTGDs), the extension of tuple- and equality-generating dependencies that permits expression of semantic relations (constraints) on variables. The implication problem is central to identifying redundant integrity constraints, checking integrity constraints on constraint databases, detecting the independence of ...

Reliability optimization problems with multiple constraints under fuzziness

In reliability optimization problems, diverse situations occur in which it is not always possible to obtain the required precision in system reliability. The imprecision in data can often be represented by triangular fuzzy numbers. In this manuscript, we consider different fuzzy environments for the reliability optimization problem of redundancy. We formulate a redundancy allocation problem for a...

Optimizing Cluster Heads for Energy Efficiency in Large-Scale Heterogeneous Wireless Sensor Networks

Many complex sensor network applications require deploying a large number of inexpensive and small sensors in a vast geographical region to achieve quality through quantity. Hierarchical clustering is generally considered as an efficient and scalable way to facilitate the management and operation of such large-scale networks and minimize the total energy consumption for prolonged lifetime. Judi...

Semantic Constraint and QoS-Aware Large-Scale Web Service Composition

Service-oriented architecture facilitates the running time of interactions by using business integration on the networks. Currently, web services are considered as the best option to provide Internet services. Due to an increasing number of Web users and the complexity of users’ queries, simple and atomic services are not able to meet the needs of users; and to provide complex services, it requ...

Stop the Chase

The chase procedure, an algorithm proposed 25+ years ago to fix constraint violations in database instances, has been successfully applied in a variety of contexts, such as query optimization, data exchange, and data integration. Its practicability, however, is limited by the fact that – for an arbitrary set of constraints – it might not terminate; even worse, chase termination is an undecidabl...

Journal:
  • PVLDB

Volume 7

Publication date: 2014